Ford GoBike Dataset Exploration

Table of Contents

Introduction

First, we import the necessary libraries.

Data Wrangling

1. Discovering

2. Structuring

First, we extract each member's age from the member_birth_year feature and store it in a new column, member_age.
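A minimal sketch of the age calculation, assuming pandas and a tiny made-up frame in place of the real dataset. The reference year (2019 here) is an assumption; use the year your extract actually covers.

```python
import pandas as pd

# Made-up stand-in rows; only the column name member_birth_year comes from the notebook.
df = pd.DataFrame({"member_birth_year": [1984.0, 1995.0, None]})

# Assumption: the trips were recorded in 2019. Change this to match your data.
REFERENCE_YEAR = 2019

# Age is the reference year minus the birth year; missing birth years stay missing.
df["member_age"] = REFERENCE_YEAR - df["member_birth_year"]
print(df)
```

Rows without a birth year keep a missing age, so they still have to be handled in the cleaning step.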

Next, we split the start_time column into year, month, weekday, day, and hour, ignoring minutes and seconds because they will not meaningfully affect our analysis.

Then we split the end_time column the same way into year, month, weekday, day, and hour, again ignoring minutes and seconds.
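Both timestamp splits can be sketched with one small helper. The helper name split_datetime, the output column names (start_year, end_hour, and so on) and the sample timestamps are hypothetical choices for illustration, not taken from the original notebook.

```python
import pandas as pd

# Made-up sample timestamps, for illustration only.
df = pd.DataFrame({
    "start_time": ["2019-02-28 17:32:10.1450", "2019-03-01 08:05:59.0000"],
    "end_time": ["2019-03-01 08:01:55.9750", "2019-03-01 08:20:01.0000"],
})

def split_datetime(frame, column, prefix):
    """Add <prefix>_year, _month, _weekday, _day and _hour columns derived from `column`."""
    parsed = pd.to_datetime(frame[column])
    frame[f"{prefix}_year"] = parsed.dt.year
    frame[f"{prefix}_month"] = parsed.dt.month
    frame[f"{prefix}_weekday"] = parsed.dt.day_name()  # e.g. "Thursday"
    frame[f"{prefix}_day"] = parsed.dt.day
    frame[f"{prefix}_hour"] = parsed.dt.hour
    # Minutes and seconds are deliberately not extracted.
    return frame

df = split_datetime(df, "start_time", "start")
df = split_datetime(df, "end_time", "end")
print(df.filter(like="start_"))
```

Using a single helper for both columns keeps the two sets of derived columns consistent with each other.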

3. Cleaning

Since there are no missing values in the latitude/longitude columns, we can replace the missing station names and IDs with made-up placeholder values.
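One way to fill in the placeholders, as a rough sketch. The placeholder values (0 and "Unknown Station"), the sample rows and the station name column names are assumptions; the notebook may have used different ones.

```python
import pandas as pd

# Made-up sample rows; the *_station_name column names are assumed.
df = pd.DataFrame({
    "start_station_id": [21.0, None],
    "start_station_name": ["Station A", None],
    "end_station_id": [None, 3.0],
    "end_station_name": [None, "Station B"],
})

# Hypothetical placeholders; pick an ID that no real station uses.
PLACEHOLDER_ID = 0
PLACEHOLDER_NAME = "Unknown Station"

for side in ("start", "end"):
    df[f"{side}_station_id"] = df[f"{side}_station_id"].fillna(PLACEHOLDER_ID)
    df[f"{side}_station_name"] = df[f"{side}_station_name"].fillna(PLACEHOLDER_NAME)

print(df)
```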

Now we drop start_time, end_time, and member_birth_year, as they are no longer needed.
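Dropping the three source columns might look like this. The duration_sec column and the sample values are only there to give the example something to keep; they are assumptions.

```python
import pandas as pd

# Made-up sample rows; duration_sec is an assumed extra column.
df = pd.DataFrame({
    "duration_sec": [523, 1210],
    "start_time": ["2019-02-28 17:32:10", "2019-03-01 08:05:59"],
    "end_time": ["2019-02-28 17:40:53", "2019-03-01 08:26:09"],
    "member_birth_year": [1984.0, 1995.0],
})

# The derived columns already carry this information, so the originals can go.
df = df.drop(columns=["start_time", "end_time", "member_birth_year"])
print(df.columns.tolist())
```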

Because we cannot determine the member gender for the missing values from the other features, we drop these rows.
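A short sketch of dropping the rows with no gender, assuming the column is named member_gender and using made-up rows:

```python
import pandas as pd

# Made-up sample rows; the column name member_gender is assumed.
df = pd.DataFrame({
    "member_gender": ["Male", None, "Female"],
    "member_age": [35.0, None, 24.0],
})

# Keep only rows where the gender is known, then renumber the index.
df = df.dropna(subset=["member_gender"]).reset_index(drop=True)
print(df)
```

Restricting dropna to one column via subset avoids throwing away rows that are only missing values elsewhere.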

Finally, we change the start_station_id, end_station_id, and member_age columns from float to the more suitable int data type.
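The type conversion can be sketched as below, with made-up values. Note that converting to int fails if any missing values remain, which is why this step comes after the cleaning above.

```python
import pandas as pd

# Made-up sample rows that are already free of missing values.
df = pd.DataFrame({
    "start_station_id": [21.0, 0.0],
    "end_station_id": [13.0, 3.0],
    "member_age": [35.0, 24.0],
})

int_columns = ["start_station_id", "end_station_id", "member_age"]

# astype with a dict converts only the listed columns and leaves the rest untouched.
df = df.astype({col: int for col in int_columns})
print(df.dtypes)
```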

4. Enriching

We skip this step, as there is no need for enriching at this time.

5. Validating

We skip this step, as no validation is needed.

6. Publishing

Now that everything is set, we are ready to save our data.
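Saving could be as simple as writing a CSV. The file name fordgobike_clean.csv is hypothetical, and the file goes to a temporary directory here only so the sketch runs anywhere; in the notebook you would point it at your project folder.

```python
import os
import tempfile

import pandas as pd

# Made-up cleaned rows standing in for the wrangled dataset.
df = pd.DataFrame({
    "start_station_id": [21, 0],
    "member_gender": ["Male", "Female"],
    "member_age": [35, 24],
})

# Hypothetical output location and file name.
output_path = os.path.join(tempfile.gettempdir(), "fordgobike_clean.csv")

# index=False keeps the row index out of the file.
df.to_csv(output_path, index=False)

# Read the file back as a quick sanity check.
reloaded = pd.read_csv(output_path)
print(reloaded.shape)
```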

Exploratory Data Analysis

1. Univariate Exploration

2. Bivariate Exploration

3. Multivariate Exploration

Conclusions

1. Results

2. Limitations